SPY-TEC: An efficient indexing method for similarity search in high-dimensional data spaces

Authors

  • Dong-Ho Lee
  • Hyoung-Joo Kim
Abstract

Most index structures based on the R-tree fail to support efficient indexing for similarity search in high-dimensional data spaces. There are two reasons: most of these index structures use a balanced split strategy in order to guarantee storage utilization, and the query region for similarity search is a hypersphere in high-dimensional spaces. In this paper, we propose the Spherical Pyramid-Technique (SPY-TEC), an efficient indexing method for similarity search in high-dimensional data spaces. The SPY-TEC is based on a special partitioning strategy that first divides the d-dimensional data space into 2d spherical pyramids and then cuts each spherical pyramid into several spherical slices. This partitioning provides a transformation of the d-dimensional space into a one-dimensional space, as the Pyramid-Technique [14] does. Thus, we are able to use a B+-tree to manage the transformed one-dimensional data. We also propose algorithms to process hyperspherical range queries on the data space partitioned by this strategy. Finally, through various experiments using synthetic and real data, we show that the SPY-TEC clearly outperforms other related techniques, including the Pyramid-Technique, in processing hyperspherical range queries.

Similar resources

An Efficient Technique for Nearest-Neighbor Query Processing on the SPY-TEC

The SPY-TEC (Spherical Pyramid-Technique) was proposed as a new indexing method for high-dimensional data spaces using a special partitioning strategy that divides a d-dimensional data space into 2d spherical pyramids. In the SPY-TEC, an efficient algorithm for processing hyperspherical range queries was introduced with a special partitioning strategy. However, the technique for processing k-n...

A fast content-based indexing and retrieval technique by the shape information in large image database

In this paper, we present an ecient content-based image retrieval (CBIR) system which employs the shape information of images to facilitate the retrieval process. For ecient feature extraction, we extract the shape feature of images automatically using edge detection and wavelet transform which is widely used in digital signal processing and image compression. To facilitate speedy retrieval, ...

A Divisive Hierarchical Clustering-Based Method for Indexing Image Information

It is conventional to use multi-dimensional indexing structures to accelerate search operations in content-based image retrieval systems. Many efforts have been made so far to develop multi-dimensional indexing structures. In most practical applications of image retrieval, high-dimensional feature vectors are required, but current multi-dimensional indexing structures lose their effici...

Retrieval of Optimal Subspace Clusters Set for an Effective Similarity Search in a High-Dimensional Spaces

High-dimensional data is often analysed by resorting to its distribution properties in subspaces. Subspace clustering is a powerful method for elicitation of high-dimensional data features. The result of subspace clustering can be an essential base for building indexing structures and further data search. However, a high number of subspaces and data instances can conceal a high number of subspace...

An efficient nearest neighbor search in high-dimensional data spaces

Similarity search in multimedia databases requires an efficient support of nearest neighbor search on a large set of high-dimensional points. A technique applied for similarity search in multimedia databases is to transform important properties of the multimedia objects into points of a high-dimensional feature space. The feature space is usually indexed using a multidimensional index structure...

Journal:
  • Data Knowl. Eng.

Volume 34

Publication date: 2000